A Fast Topological Parallel Algorithm for Traversing Large Datasets
Authors
Abstract
This work presents a parallel implementation of a graph-generating algorithm designed to be straightforwardly adapted to traverse large datasets. The new approach has been validated in a correlated scenario known as the word ladder problem. The parallel algorithm induces the same topological structure proposed by its serial version and also builds the shortest path between any pair of words to be connected by a ladder of words. The implemented parallelism paradigm is Multiple Instruction Stream, Multiple Data Stream (MIMD), and the test suite embraces 23-word instances whose intermediate words were extracted from a dictionary of 183,719 words (the dataset). The morph quality (the shortest path between the two input words) and the morph performance (CPU time) were evaluated against a serial implementation of the original algorithm. The parallel algorithm generated the optimal solution for each pair of words tested, that is, the minimum word ladder connecting an initial word to a final word was found. Thus, there was no negative impact on the quality of the solutions when comparing them with those obtained through the serial ANG algorithm. However, there was an outstanding improvement in the CPU time required to build the solutions. In fact, the time improvement was up to 99.85%, and speedups greater than 2.0X were achieved.
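For readers unfamiliar with the word ladder problem, the following is a minimal serial sketch in Python of finding a shortest ladder with breadth-first search. It is not the ANG algorithm or the parallel implementation described in the abstract; the function name `word_ladder`, the toy dictionary, and the choice to treat the goal word as always allowed are assumptions made purely for illustration.

```python
from collections import deque
from string import ascii_lowercase


def word_ladder(start, goal, dictionary):
    """Return one shortest ladder from start to goal as a list, or None.

    Illustrative only: plain breadth-first search over words that differ
    by a single letter. Intermediate words must come from `dictionary`.
    """
    if len(start) != len(goal):
        return None
    allowed = set(dictionary) | {goal}  # assumption: the goal is always allowed
    parent = {start: None}              # doubles as the visited set
    queue = deque([start])
    while queue:
        word = queue.popleft()
        if word == goal:
            # Walk the parent links back to the start word.
            path = []
            while word is not None:
                path.append(word)
                word = parent[word]
            return path[::-1]
        # Generate every one-letter variation of the current word.
        for i in range(len(word)):
            for letter in ascii_lowercase:
                candidate = word[:i] + letter + word[i + 1:]
                if candidate in allowed and candidate not in parent:
                    parent[candidate] = word
                    queue.append(candidate)
    return None


if __name__ == "__main__":
    toy_dictionary = ["cot", "cog", "dot", "dog", "cat"]
    print(word_ladder("cat", "dog", toy_dictionary))
```

Because breadth-first search visits words in order of increasing distance from the start word, the first time the goal is dequeued the recovered path has the minimum number of steps, which corresponds to the "morph quality" criterion mentioned in the abstract.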
Similar resources
Fast SFFS-Based Algorithm for Feature Selection in Biomedical Datasets
Biomedical datasets usually include a large number of features relative to the number of samples. However, some data dimensions may be less relevant or even irrelevant to the output class. Selection of an optimal subset of features is critical, not only to reduce the processing cost but also to improve the classification results. To this end, this paper presents a hybrid method of filter and wr...
AGORAS: A Fast Algorithm for Estimating Medoids in Large Datasets
The k-medoids methods for modeling clustered data have many desirable properties, such as robustness to noise and the ability to use non-numerical values; however, they are typically not applied to large datasets due to their associated computational complexity. In this paper, we present AGORAS, a novel heuristic algorithm for the k-medoids problem where the algorithmic complexity is driven by...
Fast Parallel Randomized Algorithm for Nonnegative Matrix Factorization with KL Divergence for Large Sparse Datasets
Nonnegative Matrix Factorization (NMF) with Kullback-Leibler Divergence (NMF-KL) is one of the most significant NMF problems and equivalent to Probabilistic Latent Semantic Indexing (PLSI), which has been successfully applied in many applications. For sparse count data, a Poisson distribution and KL divergence provide sparse models and sparse representation, which describe the random variation ...
A Fast Algorithm for Constructing Topological Structure in Large Data
Discovering and constructing the topological structure in data has attracted attention within the data analysis community. However, most methods developed so far are unsuitable for very large sets of data because of their computational difficulties. This paper presents a fast algorithm for constructing the inherent topological structure in large sets of data that might be noisy in order ...
Journal
Journal title: International Journal of Information Technology and Computer Science
Year: 2023
ISSN: 2074-9007, 2074-9015
DOI: https://doi.org/10.5815/ijitcs.2023.01.01